The report summarizes the progress made during a three year period on
research in data base management. The primary effort has been the design
and development of a major data base management system of the relational
type, INGRES. In addition, a number of new directions, such as distributed
data bases and data base machines, were initiated.
DESCRIPTION OF RESEARCH FINDINGS

Under  the  support  of  the  U.S.  Army  Research  Office  the  INGRES  project 
has  made  major  advances  on  several  different  fronts  during  the  period 
June 1, 1976 - September 30, 1979. We briefly indicate each in turn.

I.  Development  of  a  Usable  Relational  Data  Base  System. 

Considerable  effort  has  gone  into  producing  a  relational  data  base 
system  that  can  be  used  by  others  to  test  the  viability  of  the  relational 
approach. This has included making INGRES acceptably fast and reasonably
free from bugs, and providing needed services such as crash recovery,
concurrency control and protection. It also included producing good user
level documentation.

To  date  there  are  about  130  INGRES  installations  around  the  world 
including several military ones. These are installations at the National
Security  Agency,  the  Army  Computer  Systems  Command,  the  Navy  Ocean  Systems 
Center,  the  Navy  Personnel  Data  Center,  the  Department  of  Defense,  the  Air 
Force  Data  Services  Center,  the  Navy  Personnel  Research  and  Development  Center, 
the  Naval  Postgraduate  School,  the  Army  Research  Office  -  AIRMICS,  and  the 
Defense Advanced Research Projects Agency.

It  is  our  conclusion  that  the  development  of  INGRES  as  a  one  machine 
system has had considerable impact. However, there is little reason to
pursue  this  development  further;  that  should  be  left  to  commercial  venture. 

A summary and critique of the design experience is contained in [STON 79].

II.  Distributed  Data  Bases 

In  late  1976  we  embarked  on  extending  INGRES  to  function  on  multiple 
computer  systems  loosely  coupled  together  by  a  communications  network.  The 
problems  to  be  overcome  included: 

1) storage and redundancy of system catalogs
2) query processing in a distributed environment
3) concurrency control
4) crash recovery
5) control of multiple copies of data

Initial  thoughts  in  these  areas  were  presented  in  [STON  77]  and  more 
recent results in [STON 79a, EPST 78]. We have embarked on building a
prototype distributed data base system, and it is now nearing completion.
It should be mentioned that most of the difficulties we have encountered
concern the communications system and the operating system software to support
it; the data base code has been relatively straightforward. Because the
prototype is not yet complete, no performance data is currently available,
but we hope to have some soon.

III.  Data  Base  Machines. 

In 1977 we began investigating the feasibility of a data base machine.
We examined all the existing proposals and discarded them as not
particularly viable as a mechanism to increase the speed of our existing
software. Hence we proposed a design in 1978 [STON 78] and a new design
in 1979 [STON 79b]. The latter design is being studied and refined,
and initial coding is commencing.

The  basic  point  we  are  exploiting  is  that  data  base  machines  are  really 
nothing  other  than  distributed  data  base  systems  from  a  software  point  of 
view. Rather, the increased speed is obtained by a fast network and
specialization of function.

IV.  Data  Base  Design. 

It  has  become  evident  that  designing  data  bases  (regardless  of  what 
data base system is available) is very hard. Moreover, doing designs in
such a way that migration from CODASYL network systems to relational systems,
or vice versa, remains possible is harder yet. Consequently, we developed a
representation scheme and a set of rules for this scheme that preserve the
possibility of mapping to either CODASYL or relational systems. This set of
rules also guarantees that many of the ambiguities that can happen with poor
designs are avoided completely. This work is reported in [KATZ 79].

V.  Concurrency  Control 

One  of  the  key  problems  in  a  data  base  management  system  is  to  control 
multiple, concurrent, possibly conflicting updates. This problem is
invariably solved by creating some lockable object in the data base and then
establishing a lock management protocol. However, the problem remains:

"How  large  should  this  lockable  object  be?"  If  it  is  chosen  too  small, 
there is excessive overhead getting and releasing locks. Alternatively, if
it is chosen too large, possible parallelism is not exploited.
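
The trade-off just described can be illustrated with a small sketch. It is
a toy illustration only, not the model or the results of [RIES 77, RIES 79].
The function names, the record numbers and the granule sizes are all
hypothetical, and it assumes that records are numbered consecutively and
grouped into equal-sized lockable granules.

```python
# Toy illustration of lock granularity (hypothetical names and numbers).
# Assumption: records are numbered 0, 1, 2, ... and a "granule" is a block
# of `granule_size` consecutive records that is locked as a single unit.

def granules_touched(records, granule_size):
    """Return the set of granule ids that must be locked to cover `records`."""
    return {r // granule_size for r in records}

def conflicts(txn_a, txn_b, granule_size):
    """Two transactions conflict if both need a lock on the same granule."""
    a = granules_touched(txn_a, granule_size)
    b = granules_touched(txn_b, granule_size)
    return bool(a & b)

txn_a = [0, 1, 2, 3]          # records updated by transaction A
txn_b = [100, 101, 102, 103]  # records updated by transaction B (disjoint)

for size in (1, 10, 1000):
    locks = len(granules_touched(txn_a, size))
    print(f"granule size {size}: A takes {locks} lock(s), "
          f"conflict with B: {conflicts(txn_a, txn_b, size)}")
```

With a granule of one record, transaction A must take four locks but never
blocks B. With a granule of 1000 records, A takes a single lock, yet A and B
now conflict even though they update disjoint records.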

In  [RIES  77,  RIES  79]  we  have  completely  answered  this  question. 

VI.  Programming  Language  Interface  to  a  Data  Base  System 

We  have  investigated  how  a  data  manipulation  language  should  be 
embedded  in  a  general  purpose  programming  language  [STON  77b]  and  some  of 
the  issues  in  designing  a  new  programming  language  oriented  toward  data 
manipulation  [PREN  78]. 